Generalizing Association Rules to Ordinal Rules
Abstract
The development of good measures of interestingness for discovered rules is one of the important problems in data mining. Such measures are divided into objective measures, which depend only on the structure of a rule and the underlying data used in the discovery process, and subjective measures, which depend on the class of users who examine the rule. However, most objective measures are suitable only for binary attributes, so the usual unsupervised algorithms for discovering association rules require the initial set of attributes to be transformed into binary attributes. As a result, the complexity of these algorithms increases exponentially with the number of attributes, and this transformation can lead, on the one hand, to a combinatorial explosion and, on the other hand, to a prohibitive number of weakly significant rules with many redundancies. Moreover, the few measures suitable for numeric attributes, such as the correlation coefficient, are not selective. In this paper, we propose a new objective measure, called ordinal intensity of implication, which generalizes the intensity of implication defined for binary attributes. It evaluates whether the number of transactions not clearly verifying the rule X→Y (i.e., the number of transactions containing a high value for attribute X and a low value for attribute Y) is significantly small compared to a random draw. We finish the study with an evaluation on banking data, show some discovered ordinal rules, and consider the connection between data/information and quality.
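To make the idea of "significantly few counterexamples compared to a random draw" concrete, the following is a minimal sketch of the binary intensity of implication that the abstract says is being generalized. It assumes the commonly cited normal-approximation form of that measure; the function name `intensity_of_implication` and its parameter names are hypothetical. The abstract does not give the formula of the ordinal version, so it is not implemented here.

```python
import math


def intensity_of_implication(n, n_x, n_y, n_x_not_y):
    """Binary intensity of implication for a rule X -> Y (illustrative sketch).

    n          -- total number of transactions
    n_x        -- number of transactions containing X
    n_y        -- number of transactions containing Y
    n_x_not_y  -- observed counterexamples (X present, Y absent)

    Assumption: normal approximation to the number of counterexamples
    expected if X and Y were independent.
    """
    n_not_y = n - n_y
    # Expected number of counterexamples under independence of X and Y.
    expected = n_x * n_not_y / n
    if expected == 0:
        raise ValueError("X or not-Y never occurs; the measure is undefined")
    # Standardized gap between observed and expected counterexamples.
    q = (n_x_not_y - expected) / math.sqrt(expected)
    # Probability that a random draw yields more counterexamples than observed:
    # 1 - Phi(q), with Phi the standard normal CDF.
    return 1.0 - 0.5 * (1.0 + math.erf(q / math.sqrt(2.0)))


if __name__ == "__main__":
    # 100 transactions, 40 with X, 50 with Y, only 5 counterexamples.
    print(intensity_of_implication(100, 40, 50, 5))
```

A value close to 1 means the rule has far fewer counterexamples than chance would produce; a value near 0.5 means the counterexample count is what independence would predict.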
Similar resources
An Algorithm for the Discovery of Arbitrary Length Ordinal Association Rules
Association rule mining techniques are used to search for attribute-value pairs that occur frequently together in a data set. Ordinal association rules are a particular type of association rules that describe orderings between attributes that commonly occur over a data set [9]. Although ordinal association rules are defined between any number of the attributes, only discovery algorithms of binary ordinal assoc...
Applying Ordinal Association Rules for Cleansing Data With Missing Values
Cleansing data of errors is an important processing step, particularly when integrating heterogeneous data sources. Dirty data files are prevalent in data warehouses because of incorrect or missing data values, inconsistent attribute naming conventions, or incomplete information. This paper improves the ordinal association rules technique for data cleansing by proposing a solution for the missing val...
Quantitative and Ordinal Association Rules Mining (QAR Mining)
Association rules have exhibited an excellent ability to identify interesting association relationships among a set of binary variables describing a huge number of transactions. Although the rules can be relatively easily generalized to other variable types, the generalization can result in a computationally expensive algorithm generating a prohibitive number of redundant rules of little signific...
Some Quality Measures for Fuzzy Association Rules
Several approaches generalizing crisp association rules to fuzzy association rules have been proposed. In a previous paper we introduced a pair of confidence measures for crisp association rules from which the majority of known quality measures can be obtained. In this paper, starting from these results, we give an extension to fuzzy association rules.
Towards Association Rules with Hidden Variables
The mining of association rules can provide relevant and novel information to the data analyst. However, current techniques do not take into account that the observed associations may arise from variables that are unrecorded in the database. For instance, the pattern of answers in a large marketing survey might be better explained by a few latent traits of the population than by direct associat...